Because of growing interest in temperature-based sampling methods like
replica exchange [1-7], this note aims to make some observations and raise
some potentially important questions which we have not seen addressed
sufficiently in the literature. Mainly, we wish to call attention to limits on the
maximum speed-up to be expected from temperature-based methods, and
also note the need for careful quantification of sampling efficiency. Because
potentially lengthy studies may be necessary to address these issues, we felt 
it would be useful to bring them to the attention of the broader community. 
Here we are strictly concerned with canonical sampling at a fixed temperature, 
and not with conformational search. 

We will base our discussion around a generic replica exchange protocol, 
consisting of M levels spanning from the temperature T_0 at which canonical
sampling is desired, up to T_M. The protocol is motivated by the increased
rate of barrier crossing possible at higher temperatures. We assume each level
is simulated for a time t_sim, which implies a total CPU cost (M + 1) × t_sim.
In typical explicitly solvated peptide systems, M ~ 20, T_0 ~ 300K and
T_M ~ 450K [3] [check temp]. The relatively low T_M values reflect the
well-known, high sensitivity of the approach to configuration-space overlap in
large systems [2, 3]: that is, because of minimal overlap, typical configurations
in high T Boltzmann ensembles are unlikely in low T ensembles. We note that
a new exchange variant introduced by Berne and coworkers permits the use 
of "cold" solvent and larger temperature gaps [8], but the issues we raise still 
apply to the new protocol, especially as larger solutes are considered. 

While replica exchange is often thought of as an 'enhanced sampling 
method,' what does that mean? Indeed, what is an appropriate criterion 
for judging efficiency? As our first observation, we believe (Obs. I)
efficiency can only mean a decrease in the total CPU usage (i.e., summed
over all processors) for a given degree of sampling quality. (We will defer
the necessary discussion of assessing sampling quality, and only assume such
assessment is possible.) When the goal is canonical sampling at T_0, after
all, one has the option of running an ordinary parallel simulation at T_0 (e.g.,
[namd]) or even M independent simulations [9]. A truly efficient method
must be a superior alternative to such "brute force" simulation. 

(Obs. II) Reports in the literature offer an ambiguous picture as to 
whether replica exchange attains efficiency for canonical sampling.
Sanbonmatsu and Garcia compared replica exchange to an equivalent amount of
brute-force sampling, but their claim of efficiency is largely based on the
alternative goal of enhancing sampling over the full range of temperatures,
rather than for canonical sampling at T_0 [3]. When the data solely for T_0
are examined, there is no clear gain, especially noting that assessment was
based on principal components derived only from the replica exchange data. 
Another claim of efficiency, by Duan and coworkers [7], fails to include the 
full CPU cost of all M levels. When suitably corrected, there does appear 
to be speedup of perhaps a factor of two for T = 308K, but the system 
studied is considerably smaller (permitting larger temperature jumps) than 
would be possible in protein systems of interest. Another efficiency claim by 
Roe et al. also does not account for the full CPU cost of all ladder levels [6]. 
In a structural-glass system, replica exchange was found not to be helpful 
[10], although efficiency has been noted in spin-systems [1, 11]. We emphasize 
that biomolecular replica exchange should indeed be efficient in certain cases 
(with high enough energy barriers; see below). At least one such instance 
has been noted by Garcia, using a suitable brute-force comparison system 
[12]. 

The lack of clear-cut results in a much-heralded approach merits closer
examination. What might be preventing efficiency gain? Or put another
way, what is the maximum efficiency possible in a standard replica exchange
simulation? The very construction of the method implies that (Obs. III) in
any parallel exchange protocol, the sampling "speed" at the bottom level
(lowest T) will be controlled by the speed at which the top level (highest
T) samples the necessary space. Further, given our interest in efficient
canonical sampling at T_0, the speed of the top level should exceed that of
the bottom by at least a factor of M. If not, the simulation does not "break
even" in total CPU cost, as compared to brute-force canonical sampling at
T_0 for the full length M × t_sim.
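
The break-even condition can be written out explicitly. The relation below is only a rough sketch: the symbols s (ratio of top-level to bottom-level sampling speed) and \eta (efficiency factor) are introduced here for illustration, and the best case is assumed, in which the bottom level inherits the full speed-up of the top level.

```latex
\eta \;=\; \frac{s \, t_{\mathrm{sim}}}{M \, t_{\mathrm{sim}}} \;=\; \frac{s}{M},
\qquad
\eta \ge 1 \;\Longleftrightarrow\; s \ge M .
```

With M ~ 20, the top level would need to sample roughly 20 times faster than the bottom level merely to match brute-force simulation of the same total CPU cost.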

The basic temperature dependence of barrier-crossing is well known (e.g., 
[13]) and has important consequences for replica exchange. The Arrhenius 
factor indicates that the temperature-dependent rate k for crossing a partic- 
ular barrier obeys 

$$k_a(T) = k_0 \exp(+\Delta S / k_B)\, \exp(-\Delta E / k_B T) \qquad (1)$$

for a fixed-volume system, where k_0 is an unknown prefactor insensitive to
temperature and assumed constant; ΔE is the energy barrier and ΔS is the
entropy barrier (i.e., "narrowing" of configuration space) which must
be expected in a multi-dimensional molecular system. Two observations are
immediate: (Obs. IV) the entropic component of the rate is completely
unaffected by an increase in temperature, and the possible speedup due to
the energetic part can easily be calculated. 

The table gives possible speedups for several energy barriers and
temperatures, employing units of k_B T_0 for T_0 = 300K. Speed-ups are computed
simply as the ratio k_a(T_M) / k_a(T_0) for possible values of T_M. It is clear that
for modest barriers, the speed-up attainable even with a top temperature
T_M = 500K is only of the order of a typical number of replicas in replica
exchange, M ~ 20. Thus, (Obs. V) if modest barriers (< 8 k_B T_0) dominate a
system's dynamics, efficiency will be difficult to obtain via replica exchange, 
since the speed-up noted in the table needs to be divided by M + 1. 
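
The speed-up factors follow directly from Eq. (1), because the prefactor and the entropic factor cancel in the ratio of rates. The short Python sketch below illustrates the arithmetic under the stated assumptions (barriers in units of k_B T_0, T_0 = 300K); the function names are ours and purely illustrative.

```python
import math


def arrhenius_speedup(barrier_kt0: float, t_top: float, t0: float = 300.0) -> float:
    """Return k_a(t_top) / k_a(t0) for a barrier given in units of k_B * t0.

    The prefactor k_0 and the entropic factor exp(+dS/k_B) cancel in the
    ratio, leaving exp[(dE / k_B t0) * (1 - t0 / t_top)].
    """
    return math.exp(barrier_kt0 * (1.0 - t0 / t_top))


def rough_efficiency(speedup: float, n_levels: int) -> float:
    """Best-case efficiency factor: speed-up divided by the number of levels simulated."""
    return speedup / n_levels


if __name__ == "__main__":
    barriers = (2.0, 4.0, 6.0, 8.0)  # in units of k_B * T_0
    for t_top in (400.0, 500.0, 600.0):
        row = "  ".join(f"{arrhenius_speedup(b, t_top):6.2f}" for b in barriers)
        print(f"T_M = {t_top:.0f}K: {row}")
    # Example: an 8 k_B T_0 barrier, T_M = 500K, 21 levels (M + 1 with M = 20).
    print(rough_efficiency(arrhenius_speedup(8.0, 500.0), 21))
```

For an 8 k_B T_0 barrier with a 500K top level and 21 levels, the estimated efficiency factor is only slightly above one, i.e., close to break-even.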

How high are barriers encountered in molecular systems? We can only 
begin to answer this question, but one must first be careful about which 
barriers matter. We believe that (Obs. VI) "local" barriers will matter most: 
that is, the energy barriers actually encountered in a trajectory will dominate 
sampling speed. Apparent barriers determined by projections onto arbitrary 
low-dimensional reaction coordinates would seem of uncertain value. (We 
note that Zwanzig has attempted to account for local roughness with an 
effective diffusion constant on a slowly varying landscape [14].) 

Evidence from simulations and experiments is far from complete, but 
indicates that (Obs. VII) energy barriers in molecular systems appear to 
be modest. Here, unless noted otherwise, T_0 ~ 300K. In their extensive
study of a tetrapeptide, Czerminski and Elber found barriers < 3 kcal/mole
~ 5 k_B T_0 for the lowest energy transition path [15]. Equally interesting,
they found approximately 1,000 additional paths with similar energy profiles
(differing by < 1 kcal/mole < 2 k_B T_0), suggesting what we might term a
"pebbly" rather than "mountainous" energy landscape. See also Ref. [16].
In our own work (unpublished) with implicitly solvated dileucine, increasing
the temperature from 298K to 500K led to a hopping-rate increase of a
factor of 1.8, suggesting a small barrier (< 1.5 k_B T_0). Similarly, Sanbonmatsu
and Garcia found that barriers for explicitly solvated met-enkephalin were
small, on the order of k_B T_0 [3]. An experimental study has also suggested
barriers are modest (< 6 k_B T_0) [17]. Although this list is fairly compelling,
we believe the question of barrier heights is far from settled. Further study 
should carefully consider local vs. global barriers, as well as entropy vs. energy 
components of barriers. (We purposely do not discuss barriers to protein 
folding, because our scope here is solely equilibrium fluctuations.) 
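
The barrier estimates quoted above can be checked by inverting the Arrhenius ratio of Eq. (1). The sketch below assumes a temperature-independent prefactor and entropy barrier, so that an observed rate ratio maps directly onto an energy barrier; the function names are hypothetical, and the Boltzmann constant is taken as approximately 0.0019872 kcal/(mol K).

```python
import math

# Boltzmann constant in kcal/(mol K), i.e. the gas constant per mole.
K_B_KCAL_PER_MOL_K = 0.0019872


def barrier_from_rate_ratio(rate_ratio: float, t_low: float, t_high: float) -> float:
    """Energy barrier in units of k_B * t_low implied by k(t_high) / k(t_low).

    Assumes pure Arrhenius behavior: ratio = exp[(dE / k_B t_low) * (1 - t_low / t_high)].
    """
    return math.log(rate_ratio) / (1.0 - t_low / t_high)


def kcal_per_mol_to_kt(energy_kcal_per_mol: float, temperature: float) -> float:
    """Convert an energy in kcal/mol to units of k_B * temperature."""
    return energy_kcal_per_mol / (K_B_KCAL_PER_MOL_K * temperature)


if __name__ == "__main__":
    # Hopping rate increased 1.8-fold between 298K and 500K (dileucine example).
    print(barrier_from_rate_ratio(1.8, 298.0, 500.0))
    # A 3 kcal/mol barrier expressed in units of k_B T at 300K.
    print(kcal_per_mol_to_kt(3.0, 300.0))
```

Under these assumptions, a 1.8-fold rate increase between 298K and 500K corresponds to a barrier just under 1.5 k_B T_0, and 3 kcal/mole corresponds to about 5 k_B T_0 at 300K.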

Finally, the goal of understanding efficiency implies the need for reliable 
means for assessing sampling. An ideal approach to assessment would survey 
all pertinent substates to ensure appropriate Boltzmann frequencies. Present
approaches to assessment typically calculate free energy surfaces
(equivalently, probability distributions) on one or two-dimensional surfaces, which
are evaluated visually. Principal components (e.g., [3, 7]) as well as
"composite" coordinates like the radius of gyration [6] are popular coordinate choices.
Yet we believe that (Obs. VIII) the use of low-dimensional sampling
assessment is intrinsically limited, since it could readily mask structural diversity;
i.e., it could be consistent with substantially distinct conformational ensembles.
Future work could usefully pursue higher-dimensional measures, which can
always be numerically compared between independent simulations for
sampling assessment. In our own work, for instance, we have begun to use a
histogram measure which directly reports on the structural distribution of 
an ensemble [18]. 
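
To make the idea of a numerically comparable structural measure concrete, the following Python sketch bins configurations by their nearest reference structure and compares the resulting population histograms from two independent runs. It is an illustration under our own simplifying assumptions (configurations as plain coordinate tuples, Euclidean distance, hypothetical function names) and is not necessarily the measure of Ref. [18]; a real application would likely use aligned structures and an RMSD-type distance.

```python
import math
from collections import Counter


def nearest_reference(config, references):
    """Index of the reference structure closest to config (Euclidean distance)."""
    return min(range(len(references)), key=lambda i: math.dist(config, references[i]))


def structural_histogram(configs, references):
    """Fraction of configurations assigned to each reference structure."""
    counts = Counter(nearest_reference(c, references) for c in configs)
    return [counts.get(i, 0) / len(configs) for i in range(len(references))]


def max_population_difference(hist_a, hist_b):
    """Largest absolute difference in bin populations between two histograms."""
    return max(abs(a - b) for a, b in zip(hist_a, hist_b))


if __name__ == "__main__":
    # Toy two-dimensional "configurations" standing in for molecular structures.
    references = [(0.0, 0.0), (10.0, 0.0)]
    run_a = [(0.1, 0.0), (0.2, 0.1), (9.9, 0.0), (10.1, 0.2)]
    run_b = [(0.0, 0.1), (9.8, 0.0), (10.0, 0.1), (10.2, 0.0)]
    hist_a = structural_histogram(run_a, references)
    hist_b = structural_histogram(run_b, references)
    print(hist_a, hist_b, max_population_difference(hist_a, hist_b))
```

A large population difference between independent runs would indicate that at least one of them has not converged, without requiring projection onto one or two coordinates.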

In conclusion, we have attempted to tie together a number of straight- 
forward observations which reflect concerns about the effectiveness of the 
replica exchange simulation method, when the goal is single-temperature 
canonical sampling. The concerns suggest other simulation strategies, such 
as Hamiltonian exchange [19] and resolution exchange [20,21], may merit 
consideration — as well as scrutiny. We emphasize that our goal has been to 
raise questions more than to answer them. Even if our worries turn out to 
be exaggerated, a candid discussion of the issues should be beneficial to the 
molecular simulation community. 

Garcia, and Robert Swendsen for very useful conversations. We gratefully
acknowledge support from the NIH, through Grants ES007318 and GM070987.

T_M      ΔE = 2 k_B T_0    4 k_B T_0    6 k_B T_0    8 k_B T_0
400K     1.65              2.72         4.48         7.39
500K     2.23              4.95         11.0         24.5
600K     2.72              7.39         20.1         54.6

Table 1: High-temperature speed-up factors calculated using Arrhenius
factors. Speed-up factors are computed as the ratio k_a(T_M)/k_a(T_0 = 300K) for
the indicated energy barriers ΔE via Eq. (1). Energy barriers are given in
units of k_B T_0. A rough estimate of the efficiency factor (the factor by which
total CPU usage is reduced) obtainable in an M-level parallel replica
exchange simulation with maximum temperature T_M is the table entry divided
by M.